Neurosymbolic Transformers for Multi-Agent Communication
We study the problem of inferring communication structures that can solve cooperative multi-agent planning problems while minimizing the amount of communication. We quantify the amount of communication as the maximum degree of the communication graph; this metric captures settings where agents have limited bandwidth. Minimizing communication is challenging due to the combinatorial nature of both the decision space and the objective; for instance, we cannot solve this problem by training neural networks using gradient descent. We propose a novel algorithm that synthesizes a control policy that combines a programmatic communication policy used to generate the communication graph with a transformer policy network used to choose actions. Our algorithm first trains the transformer policy, which implicitly generates a soft communication graph; then, it synthesizes a programmatic communication policy that hardens this graph, forming a neurosymbolic transformer. Our experiments demonstrate how our approach can synthesize policies that generate low-degree communication graphs while maintaining near-optimal performance.
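To make the maximum-degree metric and the idea of "hardening" a soft communication graph concrete, here is a minimal sketch. It is not the paper's synthesis procedure: it swaps in a simple top-k rule in place of the programmatic communication policy. The names `harden_top_k` and `max_degree`, the (receiver, sender) edge convention, and the choice to count both incoming and outgoing edges toward an agent's degree are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]  # (receiver, sender); this convention is an assumption


def harden_top_k(attention: List[List[float]], k: int) -> Set[Edge]:
    """Keep, for each receiving agent i, the k senders j != i with the
    largest soft attention weight. A stand-in for a programmatic
    communication policy, not the method described in the abstract."""
    edges: Set[Edge] = set()
    for i, row in enumerate(attention):
        candidates = [j for j in range(len(row)) if j != i]
        # Descending weight; ties broken by lower index for determinism.
        candidates.sort(key=lambda j: (-row[j], j))
        for j in candidates[:k]:
            edges.add((i, j))
    return edges


def max_degree(edges: Set[Edge], n_agents: int) -> int:
    """Maximum over agents of max(in-degree, out-degree)."""
    in_deg = [0] * n_agents   # messages received by each agent
    out_deg = [0] * n_agents  # messages sent by each agent
    for receiver, sender in edges:
        in_deg[receiver] += 1
        out_deg[sender] += 1
    return max(max(in_deg[a], out_deg[a]) for a in range(n_agents))


# Hypothetical soft attention weights for 4 agents (row = receiver).
soft = [
    [0.00, 0.70, 0.20, 0.10],
    [0.60, 0.00, 0.30, 0.10],
    [0.50, 0.40, 0.00, 0.10],
    [0.80, 0.10, 0.10, 0.00],
]
hard = harden_top_k(soft, k=1)
print(sorted(hard))          # [(0, 1), (1, 0), (2, 0), (3, 0)]
print(max_degree(hard, 4))   # 3
```

In this toy example every agent listens to only one sender, yet agent 0 must send to three others, so the maximum degree is 3. A per-agent top-k rule bounds in-degree but not out-degree, which hints at why minimizing the maximum degree is a combinatorial problem rather than a purely local one.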
Review for NeurIPS paper: Neurosymbolic Transformers for Multi-Agent Communication
Weaknesses:
- The method relies on each agent having observations of other agents (o_{i,j}). This seems like a very strong assumption, given that the motivation for this work was to lower the communication bandwidth necessary. The authors should comment on how this requirement could be weakened to allow scaling to more complex environments.
- The "loss" in Figure 2 is not clearly defined, and it would be much clearer to use "reward" as the y-axis in these figures.
- The overlapping error bars in many of the results call into question the significance of the findings.
Review for NeurIPS paper: Neurosymbolic Transformers for Multi-Agent Communication
The paper proposes an approach for inferring the communication graph in multi-agent systems. It combines gradient-based optimization with a discretization or "hardening" step. The method addresses a relevant problem, is reasonably well explained, and produces promising empirical results. In their initial reviews the reviewers expressed a number of concerns; these were, however, addressed at least in part by the author response, and ultimately all reviewers recommend acceptance. One remaining caveat is the experimental evaluation, which could be strengthened, e.g., by demonstrating that the approach works across a broader range of problems. Furthermore, the authors are strongly encouraged to incorporate the clarifications provided to the reviewers as part of the author response.